Data Mining And Data Warehousing
Reconstruction-based Outlier Detection using Autoencoders
Idea of Reconstruction-based Methods
The assumptions made are as follows:
Normal data has regular patterns that a model can learn. Outliers deviate from these patterns and are harder to reconstruct.
So, if we train a model to reconstruct its input, we expect the reconstruction error to be low for normal points and high for outliers.
An autoencoder is a type of neural network with two parts:
- Encoder: Compresses input x into a lower-dimensional latent space z.
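- Decoder: Produces a reconstruction x_mean of the original input x from z.
The network is trained to minimize the reconstruction loss, usually the Mean Squared Error (MSE), i.e. the squared difference averaged over the input features:
(x - x_mean)^2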
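The encoder, decoder, and MSE loss described above can be sketched with a very small linear autoencoder. This is only an illustrative sketch, not a production model: the 2-D toy data, the single latent value z, the weight names w and v, the starting weights, the learning rate, and the number of epochs are all assumptions chosen for the example. Real autoencoders usually add bias terms, non-linear activations, and more layers.

```python
def encode(x, w):
    """Encoder: compress a 2-D input x into a single latent value z."""
    return w[0] * x[0] + w[1] * x[1]


def decode(z, v):
    """Decoder: expand the latent value z into a 2-D reconstruction x_mean."""
    return [v[0] * z, v[1] * z]


def mse(x, x_mean):
    """Mean squared error between x and its reconstruction x_mean."""
    return sum((a - b) ** 2 for a, b in zip(x, x_mean)) / len(x)


def average_loss(data, w, v):
    """Average reconstruction loss over a data set."""
    return sum(mse(x, decode(encode(x, w), v)) for x in data) / len(data)


def train(data, w, v, lr=0.05, epochs=2000):
    """Full-batch gradient descent on the average MSE (assumed settings)."""
    n = len(data)
    w, v = list(w), list(v)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_v = [0.0, 0.0]
        for x in data:
            z = encode(x, w)
            x_mean = decode(z, v)
            # With 2 features, d(mse)/d(x_mean[j]) = x_mean[j] - x[j].
            err = [x_mean[j] - x[j] for j in range(2)]
            # Chain rule through the decoder: x_mean[j] = v[j] * z.
            for j in range(2):
                grad_v[j] += err[j] * z / n
            # Chain rule through the encoder: z = w[0]*x[0] + w[1]*x[1].
            dz = err[0] * v[0] + err[1] * v[1]
            for j in range(2):
                grad_w[j] += dz * x[j] / n
        w = [w[j] - lr * grad_w[j] for j in range(2)]
        v = [v[j] - lr * grad_v[j] for j in range(2)]
    return w, v


# Toy "normal" data: every point follows the pattern x2 = 2 * x1.
data = [[t, 2.0 * t] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]

w0, v0 = [0.5, 0.5], [0.5, 0.5]          # arbitrary starting weights
loss_before = average_loss(data, w0, v0)
w, v = train(data, w0, v0)
loss_after = average_loss(data, w, v)
print("loss before training:", loss_before)
print("loss after training: ", loss_after)
```

Because the toy data lies on a single line, one latent value is enough to describe it, so the loss can shrink close to zero after training.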
Using it for Outlier Detection
- Train the autoencoder on normal data (without outliers).
- For any new data point, p:
  - Pass p through the autoencoder.
  - Compute the reconstruction error for x = p: E = (x - x_mean)^2
  - If E > threshold, label p as an outlier.
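The detection steps above can be sketched as follows. To keep the example short, a fixed projection onto the line x2 = 2 * x1 stands in for a trained autoencoder; that stand-in, the toy data, the function names, and the rule "threshold = mean + 3 standard deviations of the errors on normal data" are all assumptions for illustration. The notes do not prescribe how the threshold is chosen, and in practice it is often tuned on validation data.

```python
import statistics


def reconstruct(x):
    """Stand-in for a trained autoencoder (assumption, not a real model)."""
    z = (x[0] + 2.0 * x[1]) / 5.0   # "encoder": 1-D coordinate along the line
    return [z, 2.0 * z]             # "decoder": back to a 2-D point x_mean


def reconstruction_error(x, reconstruct_fn):
    """E = mean squared difference between x and its reconstruction x_mean."""
    x_mean = reconstruct_fn(x)
    return sum((a - b) ** 2 for a, b in zip(x, x_mean)) / len(x)


def fit_threshold(normal_data, reconstruct_fn, k=3.0):
    """Assumed rule: mean + k * std of the errors measured on normal data."""
    errors = [reconstruction_error(x, reconstruct_fn) for x in normal_data]
    return statistics.mean(errors) + k * statistics.pstdev(errors)


def is_outlier(p, reconstruct_fn, threshold):
    """Label p as an outlier if its reconstruction error exceeds the threshold."""
    return reconstruction_error(p, reconstruct_fn) > threshold


# Normal data: points that roughly follow x2 = 2 * x1.
normal_data = [[-1.0, -2.1], [-0.5, -0.9], [0.0, 0.1], [0.5, 1.0], [1.0, 1.9]]
threshold = fit_threshold(normal_data, reconstruct)

for p in ([0.8, 1.6], [1.0, -1.0]):
    error = reconstruction_error(p, reconstruct)
    label = "outlier" if is_outlier(p, reconstruct, threshold) else "normal"
    print(p, "error =", error, "->", label)
```

The point [0.8, 1.6] follows the learned pattern and is reconstructed almost exactly, while [1.0, -1.0] lies far from it, so its reconstruction error is much larger than the threshold.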